Abstract: We show the equivalence of two different notions of quantum channel capacity: that which uses the entanglement fidelity as its criterion for success in transmission, and that which uses the minimum fidelity of pure states in a subspace of the input Hilbert space as its criterion. As a corollary, any source with entropy less than the capacity may be transmitted with high entanglement fidelity. We also show that a restricted class of encodings is sufficient to transmit any quantum source which may be transmitted on a given channel. This enables us to simplify a known upper bound for the channel capacity. It also enables us to show that the availability of an auxiliary classical channel from encoder to decoder does not increase the quantum capacity.

Keywords: Channel capacity, Quantum channels, Quantum information.
I. Introduction

A theory of quantum information is emerging which 
shows striking parallels with, but also fascinating differ- 
ences from, classical information theory. One of the prin- 
cipal concerns of such theories is the capacity of a noisy 
channel for transmitting the state of a system despite some 
uncertainty about that state; that is, for rendering the state 
of some other system virtually identical to the initial state 
of the system at hand. In classical information theory, 
this is one of a set of mutually exclusive classical states; 
in quantum mechanics, a quantum state represented by a 
vector in a Hilbert space, or a density operator on that 
space. Classically, the input system may retain its original 
state, while the no-cloning theorem and related results [...] imply that in the
quantum case the input system cannot in general remain
in its initial state. Both theories allow the use of encod- 
ing and decoding operations to increase the fidelity with 
which states are transmitted. Due partly to the peculiarly 
quantum fact that a system's state may be entangled with 
that of other systems, a greater variety of definitions of 
capacity has arisen in quantum mechanics, depending, for 
example, on whether the entanglement of a system with 
some reference system is required to be preserved by the 
transmission process, or not. Here we concentrate on two 
notions of quantum capacity, one investigated for example in [13], [14], [15], [16], concerned with the maximum entropy of a density operator whose entanglement with a reference system which does not undergo the noise process can be preserved with high fidelity, and another arising for example in [...] and concerned with the maximum size of a Hilbert space all of whose pure states can be preserved with high fidelity. We show that these two definitions of capacity are in fact equivalent, in the situation in which sources are required to satisfy the quantum analogue of the asymptotic equipartition principle. We also show that any source with entropy less than the capacity may be sent with high entanglement fidelity, so that quantum entropy and capacity parallel classical entropy and capacity in this respect.

We also establish that any source that may be transmit- 
ted may be transmitted using only a maximal partial isom- 
etry as an encoding. This can be interpreted as meaning 
that encoding can be a unitary process, except for an ini- 
tial projection of the source onto a subspace small enough 
to fit into the channel, if the channel is smaller than the 
source. 

This fact, which is in some ways analogous to the source- 
channel coding separation theorem of classical information 
theory, allows us to simplify a known upper bound on the 
quantum channel capacity, by removing from the expres- 
sion a maximization over encodings, confirming an earlier 
conjecture. The conjecture has also been confirmed by [...], but the result that any source that may be transmitted may be transmitted using partially isometric encodings is slightly stronger than that obtained in [...].

In [...], Adami and Cerf express the view that "Whether a capacity can be defined consistently that characterizes the 'purely' quantum component of a channel is an open question." In our view, the pure-state capacity defined below
and in earlier papers is just such a consistently defined ca- 
pacity, and the result that any source with entropy less than 
the entanglement capacity of a channel may be transmitted 
with high entanglement fidelity removes the last possible 
objection to the capacity for entanglement transmission as 
another such notion of "purely quantum" capacity. 

We note that besides those cited above, many authors have worked on the problem of quantum information transmission through quantum channels; some of this work calculates or bounds the capacity we study here, for particular channels or classes of channels. An incomplete list that could serve as an entry to the literature includes [17], [...]. Some of the extensive literature on the more algebraic approach to quantum coding also yields information about the quantum capacity.

II. Quantum sources and channel capacity 

A. Mathematical preliminaries and notation 

The effect of encoding procedures, decoding procedures, and noisy quantum channels on the state of a system may be described by completely positive linear maps $\mathcal{N}$ from the space $B(H_c)$ of bounded linear operators on an input Hilbert space $H_c$ to the space $B(H_o)$ of bounded linear operators on an output Hilbert space $H_o$ [25], [26], [27]. In this paper, we consider only discrete channels, which we define as having finite-dimensional input and output Hilbert spaces (the word "bounded" in the specification of the input and output spaces is redundant in the discrete case). We will sometimes use the term quantum operation for a trace-nonincreasing completely positive map. Such maps have representations in terms of linear operators $A_i$:

$$\mathcal{A}(\rho) = \sum_i A_i \rho A_i^\dagger \tag{1}$$

with

$$\sum_i A_i^\dagger A_i \le I; \tag{2}$$

equality holds in the latter when the map is trace-preserving. We call the set $\{A_i\}$ an operator decomposition, or simply decomposition, of the operation $\mathcal{A}$, and sometimes write:

$$\mathcal{A} \sim \{A_i\} \tag{3}$$

to indicate that $\{A_i\}$ is an operator decomposition of $\mathcal{A}$. Any two decompositions of the same operation, $\{A_i\}$ having $r$ operators and $\{B_i\}$ having $s \le r$ operators, are related by [...]:

$$A_i = \sum_j m_{ij} B_j, \tag{4}$$



where $m$ is the matrix of a maximal partial isometry from the complex vector space $\mathbb{C}^s$ to $\mathbb{C}^r$. A partial isometry is a generalization of a unitary operator, which must satisfy $VV^\dagger = \Pi$ for some projector $\Pi$. Such an isometry will then also satisfy $V^\dagger V = \Gamma$ for some projector $\Gamma$ having the same dimensionality as $\Pi$. If the range and domain spaces of a linear operator $V$ have different dimensionality, it will not be possible to find a unitary mapping between the two: the best one can do is find a partial isometry $V$ such that one of $VV^\dagger$ and $V^\dagger V$ is the identity (whichever one operates on the smaller space). We will call such a map a maximal partial isometry between the spaces $S_1$ and $S_2$. A partial isometry with $VV^\dagger$ having dimension $C$ may be thought of as projecting onto a $C$-dimensional subspace of $V$'s domain Hilbert space and then mapping that subspace unitarily to a $C$-dimensional subspace of the range Hilbert space. Thus if $s \le r$ in (4), $m$'s columns are $s$ orthonormal vectors in $\mathbb{C}^r$:

$$\sum_i m_{ij}^* m_{ik} = \delta_{jk}, \tag{5}$$

or in other words:

$$m^\dagger m = I^{(s)}. \tag{6}$$
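The remixing freedom can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes the relation $A_i = \sum_j m_{ij} B_j$ with $m$ an $r \times s$ matrix whose columns are orthonormal, and it uses an amplitude-damping operation, a damping probability `g`, a particular matrix `m` and a test state `rho` that are all illustrative choices. It verifies that the remixed operators induce the same map and still satisfy the completeness relation.

```python
import math

def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def zeros(d):
    return [[0j] * d for _ in range(d)]

def apply_operation(ops, rho):
    """sum_i A_i rho A_i^dagger for a list of operators."""
    out = zeros(len(rho))
    for op in ops:
        out = add(out, matmul(matmul(op, rho), dagger(op)))
    return out

def completeness(ops):
    """sum_i A_i^dagger A_i, which is the identity for a trace-preserving map."""
    out = zeros(len(ops[0]))
    for op in ops:
        out = add(out, matmul(dagger(op), op))
    return out

def remix(m, ops):
    """Build A_i = sum_j m[i][j] B_j from the operators B_j."""
    mixed = []
    for row in m:
        acc = zeros(len(ops[0]))
        for m_ij, b_j in zip(row, ops):
            acc = add(acc, scale(m_ij, b_j))
        mixed.append(acc)
    return mixed

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

g = 0.3  # illustrative damping probability (assumption)
B = [[[1, 0], [0, math.sqrt(1 - g)]],   # two-operator decomposition (s = 2)
     [[0, math.sqrt(g)], [0, 0]]]
r2 = 1 / math.sqrt(2)
m = [[r2, 0.5], [r2, -0.5], [0.0, r2]]  # 3 x 2, columns orthonormal in C^3
A = remix(m, B)                          # three-operator decomposition (r = 3)

rho = [[0.6, 0.2], [0.2, 0.4]]           # illustrative density matrix
same_output = max_abs_diff(apply_operation(A, rho), apply_operation(B, rho))
completeness_error = max_abs_diff(completeness(A), [[1, 0], [0, 1]])
print(same_output, completeness_error)
```

Both printed numbers should be at the level of floating-point rounding, reflecting that only the column orthonormality of `m` is needed for the two decompositions to define the same operation.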



Sometimes an operation $\mathcal{A}$ will have a decomposition consisting of a single operator $A$; in this case, we will often use the roman letter $A$ to denote the operation $\mathcal{A}$ as well as the operator $A$ when no confusion will result. We note that care is needed when the operator includes a scalar factor $z$: thus if $\mathcal{A} \sim \{A\}$ while $\mathcal{B} \sim \{zA\}$, we may also refer to the operation $\mathcal{B}$ as either $zA$ or $|z|^2\mathcal{A}$.

We write $\mathcal{A}\mathcal{E}$ for the operation of $\mathcal{E}$ followed by $\mathcal{A}$; thus $\mathcal{A}\mathcal{E}(\rho) = \mathcal{A}(\mathcal{E}(\rho))$.

Any quantum operation on a system $Q$ may be realized [28], [...] by a "unitary representation" in which the Hilbert space $Q$ is extended by adjoining an environment $E$ prepared in a standard state $|0^E\rangle$, and the system and environment undergo a unitary interaction, followed by a projection on the environment system. Any such unitary interaction with a given initial environment state determines a quantum operation. (In the case of a trace-preserving operation, the environment projection is the identity.) That is,

$$\mathcal{A}(\rho) = \mathrm{tr}_E\left(\pi^E U^{QE}\left(|0^E\rangle\langle 0^E| \otimes \rho^Q\right) U^{QE\dagger}\right). \tag{7}$$

The operators $A_i$ in the operator decomposition representation discussed above turn out to be the "operator matrix elements"

$$\langle i^E| U^{QE} |0^E\rangle \tag{8}$$

of the unitary interaction, between the initial environment state and orthonormal environment vectors $|i^E\rangle$ of the basis used for the partial trace over the environment. The freedom (4) to "unitarily remix" the operators $A_i$, obtaining another valid decomposition, is just the freedom to do the environment partial trace in a different environment basis (related to the first by that same unitary).

B. Transmission and capacity

We now review the problem of entanglement transmission; a fuller discussion may be found in [...]. Here the goal is to use block coding to send the density operator of a source in a manner which preserves its entanglement with whatever reference system it may be entangled with. We imagine the density operator $\rho^Q$ of our quantum system to arise from a pure state on a larger composite system $RQ$, by tracing out the "reference" system $R$. That is,

$$\rho^Q = \mathrm{tr}_R\left(|\psi^{RQ}\rangle\langle\psi^{RQ}|\right). \tag{9}$$

For $R$ with dimension at least as great as that of $\rho^Q$'s support, such purifications always exist; different purifications of the same $\rho^Q$ are related by unitary transformations on $R$. We define the entanglement fidelity as

$$F_e(\rho^Q,\mathcal{A}) = \langle\psi^{RQ}|\,(\mathcal{I}^R \otimes \mathcal{A})\left(|\psi^{RQ}\rangle\langle\psi^{RQ}|\right)|\psi^{RQ}\rangle, \tag{10}$$

the matrix element of the final, noise-affected state of the system $RQ$, with the initial state $|\psi^{RQ}\rangle$. This is easily shown to be independent of which purification $|\psi^{RQ}\rangle$ is used, and to have the form:

$$F_e(\rho^Q,\mathcal{A}) = \sum_i \left|\mathrm{tr}\, A_i\rho^Q\right|^2. \tag{11}$$
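As a small illustration of the operator-decomposition form of the entanglement fidelity, $F_e(\rho,\mathcal{A}) = \sum_i |\mathrm{tr}\,A_i\rho|^2$, the sketch below evaluates it for a qubit dephasing operation. The choice of channel, the error probability `p` and the two input states are assumptions made for illustration only; they do not appear in the paper.

```python
import math

def trace_of_product(a, rho):
    """tr(a rho) for square matrices stored as nested lists."""
    n = len(rho)
    return sum(a[i][k] * rho[k][i] for i in range(n) for k in range(n))

def entanglement_fidelity(ops, rho):
    """Sum over the decomposition of |tr(A_i rho)|^2 (unrenormalized)."""
    return sum(abs(trace_of_product(op, rho)) ** 2 for op in ops)

p = 0.1  # illustrative phase-flip probability (assumption)
identity = [[1, 0], [0, 1]]
pauli_z = [[1, 0], [0, -1]]
dephasing = [
    [[math.sqrt(1 - p) * x for x in row] for row in identity],
    [[math.sqrt(p) * x for x in row] for row in pauli_z],
]

rho_mixed = [[0.5, 0], [0, 0.5]]   # maximally mixed qubit
rho_zero = [[1, 0], [0, 0]]        # pure state |0><0|

fe_mixed = entanglement_fidelity(dephasing, rho_mixed)
fe_zero = entanglement_fidelity(dephasing, rho_zero)
print(fe_mixed, fe_zero)
```

For the maximally mixed input only the identity term contributes, so the value is $1 - p$; for the pure input $|0\rangle\langle 0|$, which dephasing leaves unchanged, the two terms add up to one.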
Note that while [16] defined $F_e$ as the renormalized entanglement fidelity $\sum_i |\mathrm{tr}\, A_i\rho|^2 / \mathrm{tr}\,\mathcal{A}(\rho)$, we have omitted the normalization, since the unrenormalized version is most useful in the present context. When we need the renormalized entanglement fidelity, just defined, we will use the symbol $\hat{F}_e$.

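We define a quantum source $\Sigma = (H_s, \Upsilon)$ to consist of a Hilbert space $H_s$ and a sequence $\Upsilon = \{\rho_s^{(1)}, \rho_s^{(2)}, \ldots, \rho_s^{(n)}, \ldots\}$, where $\rho_s^{(1)}$ is a density operator on $H_s$, $\rho_s^{(2)}$ a density operator on $H_s \otimes H_s$, and $\rho_s^{(n)}$ a density operator on $H_s^{\otimes n}$, etcetera. We define the entropy rate of a source $\Sigma$ as

$$S(\Sigma) = \limsup_{n\to\infty} \frac{S(\rho_s^{(n)})}{n}. \tag{12}$$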



(Sometimes we use the term "entropy of a source" to mean its entropy rate.) A quantum channel will be a trace-preserving map

$$\mathcal{N} : B(H_c) \to B(H_o) \tag{13}$$

from operators over a channel input space $H_c$ to operators over a channel output space $H_o$. A coding scheme for a given source into a given channel consists of a sequence $(\mathcal{E}^{(n)}, \mathcal{D}^{(n)})$ of trace-preserving encoding maps and decoding maps

$$\mathcal{E}^{(n)} : B(H_s^{\otimes n}) \to B(H_c^{\otimes n}), \qquad \mathcal{D}^{(n)} : B(H_o^{\otimes n}) \to B(H_s^{\otimes n}). \tag{14}$$

We say that a source $\Sigma$ may be sent reliably over a quantum channel $\mathcal{N}$ if there exists a coding scheme such that

$$\lim_{n\to\infty} F_e\left(\rho_s^{(n)}, \mathcal{D}^{(n)}\mathcal{N}^{\otimes n}\mathcal{E}^{(n)}\right) = 1. \tag{15}$$



We say that rate $R$ is achievable with a quantum channel $\mathcal{N}$ if there is a source $\Sigma$ with entropy $R$ which may be sent reliably over the channel. We define the quantum capacity of the channel for transmission of entanglement, $Q_e(\mathcal{N})$, as the supremum of rates achievable with the channel $\mathcal{N}$. This definition of channel capacity leaves open the possibility that although some sources with entropy close to the capacity can be sent reliably, not all such sources can. Classically, it turns out that this is not the case: any source with entropy less than the classical capacity may be sent reliably. In what follows, we will establish that this is also the case for the quantum capacity. We will also establish the equality of the capacity for entanglement transmission, $Q_e$, with the capacity for transmission of pure states in a subspace, $Q_s$, used for example in [...]. We define the minimum pure-state fidelity, or simply pure-state fidelity, of a subspace $H$ of the channel input Hilbert space as

$$F_p(H,\mathcal{A}) = \min_{|\psi\rangle \in H} \langle\psi|\,\mathcal{A}(|\psi\rangle\langle\psi|)\,|\psi\rangle. \tag{16}$$



We say the rate $R$ of transmission of subspace dimensions is achievable with channel $\mathcal{N}$ if there exists a sequence of subspaces $H^{(n)}$ of $H_s^{\otimes n}$ such that

$$\limsup_{n\to\infty} \frac{\log\dim(H^{(n)})}{n} = R \tag{17}$$

and there is a coding scheme which sends it reliably in the sense that

$$\lim_{n\to\infty} F_p\left(H^{(n)}, \mathcal{D}^{(n)}\mathcal{N}^{\otimes n}\mathcal{E}^{(n)}\right) = 1. \tag{18}$$

We define the capacity of the channel $\mathcal{N}$ for transmission of subspaces, $Q_s$, as the supremum of achievable rates of transmission of subspace dimensions with channel $\mathcal{N}$.

C. The Quantum Asymptotic Equipartition Property 

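The $\epsilon$-typical subspace for an $n$-block of material $\rho^{(n)}$ produced by a quantum source $\Sigma$ on a Hilbert space $H$ is defined to be the subspace $T_\epsilon^{(n)}$ of $H^{\otimes n}$ spanned by the eigenvectors $|\lambda\rangle$ of $\rho^{(n)}$ whose eigenvalues $\lambda$ satisfy:

$$2^{-n(S(\Sigma)+\epsilon)} \le \lambda \le 2^{-n(S(\Sigma)-\epsilon)}. \tag{19}$$

An equivalent requirement is:

$$\left| -\frac{1}{n}\log\lambda - S(\Sigma) \right| \le \epsilon. \tag{20}$$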



The definition derives its interest from the fact that for some interesting sources (for example, the i.i.d. source with $\rho^{(n)} = \rho^{\otimes n}$ [...]) all but a negligible portion of the source becomes concentrated in an $\epsilon$-typical subspace as $n$ goes to infinity, no matter how small $\epsilon$ is chosen to be. More formally, the i.i.d. source satisfies the Quantum Asymptotic Equipartition Property (QAEP). (Here and elsewhere, we will sometimes use the phrase "for large enough $n$, $P(n)$ is true" to mean "there exists an $n_0$ such that for all $n > n_0$, $P(n)$ is true".)

Definition 1: A source $\Sigma = \{\rho^{(1)}, \ldots, \rho^{(n)}, \ldots\}$ is said to satisfy the Quantum Asymptotic Equipartition Property if for any positive $\epsilon$ and $\delta$, for large enough $n$ the $\epsilon$-typical subspace of $\rho^{(n)}$ satisfies:

$$\mathrm{tr}\,\Lambda^{(n)}\rho^{(n)}\Lambda^{(n)} > 1 - \delta, \tag{21}$$

where $\Lambda^{(n)}$ is the projector onto $T_\epsilon^{(n)}$.

An immediate consequence of satisfaction of the QAEP is the following bound on the dimension of the typical subspace, which holds for $n$ large enough that the trace bound in the QAEP is satisfied:

$$(1-\delta)\,2^{n(S(\Sigma)-\epsilon)} \le \dim(T_\epsilon^{(n)}) \le 2^{n(S(\Sigma)+\epsilon)}. \tag{22}$$
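For an i.i.d. qubit source the typical subspace can be computed exactly, which gives a concrete feel for the dimension bounds. The sketch below is an illustration under stated assumptions: the single-copy state has eigenvalues `p` and `1 - p`, so the eigenvalues of the $n$-fold tensor product are $p^k(1-p)^{n-k}$ with multiplicity $\binom{n}{k}$; the values `p = 0.1`, `n = 200` and `eps = 0.1` are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def typical_subspace(p, n, eps):
    """Entropy, dimension and weight of the eps-typical subspace of the
    n-fold tensor power of a qubit state with eigenvalues p and 1 - p."""
    entropy = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    dim = 0
    weight = 0.0
    for k in range(n + 1):
        # log2 of the eigenvalue p^k (1-p)^(n-k)
        log_lam = k * math.log2(p) + (n - k) * math.log2(1 - p)
        if abs(-log_lam / n - entropy) <= eps:
            mult = math.comb(n, k)          # multiplicity of this eigenvalue
            dim += mult
            weight += mult * 2.0 ** log_lam
    return entropy, dim, weight

n, eps = 200, 0.1
S, dim, weight = typical_subspace(p=0.1, n=n, eps=eps)
upper = 2.0 ** (n * (S + eps))            # upper bound on the dimension
lower = weight * 2.0 ** (n * (S - eps))   # lower bound, with 1 - delta = weight
print(S, math.log2(dim) / n, weight)
```

Here `weight` plays the role of the trace in the QAEP condition, so the lower bound is evaluated with $1-\delta$ replaced by the actual weight captured by the typical subspace.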



A slightly more involved consequence is that for large enough $n$ no subspace of dimension smaller than the lower bound $(1-\delta)\,2^{n(S(\Sigma)-\epsilon)}$ on the size of the typical subspace has probability greater than $\delta$. That is, if $\Pi$ is the projector onto such a space,

$$\mathrm{tr}\,\Pi\rho^{(n)} \le \delta. \tag{23}$$

See [...].

The classical Shannon-McMillan-Breiman theorem states 
that all stationary ergodic classical sources satisfy the 
(classical) AEP; however, these are not necessarily all the 
sources which satisfy it. There is as yet no known quantum 
analogue of the Shannon-McMillan-Breiman theorem, pro- 
viding a broad and natural class of sources satisfying the 
QAEP, although there has been work in this direction [...].
III. Useful facts about fidelities

A. Convexity of Entanglement Fidelity in the Input Den- 
sity Operator 

Lemma 1: The entanglement fidelity is convex in the input density operator,

$$F_e(\lambda\rho_1 + (1-\lambda)\rho_2, \mathcal{E}) \le \lambda F_e(\rho_1,\mathcal{E}) + (1-\lambda)F_e(\rho_2,\mathcal{E}). \tag{24}$$

Proof: Note that the entanglement fidelity may be viewed as the squared norm $\|\mathbf{a}\|^2 = \sum_i |a_i|^2$ of a complex vector $\mathbf{a}$ whose components are:

$$a_i = \mathrm{tr}\, A_i\rho_1. \tag{25}$$

Then, letting also

$$b_i = \mathrm{tr}\, A_i\rho_2, \tag{26}$$

the entanglement fidelity of the convex combination of $\rho_1$ and $\rho_2$ may be written

$$F_e(\lambda\rho_1 + (1-\lambda)\rho_2, \mathcal{E}) = \|\lambda\mathbf{a} + (1-\lambda)\mathbf{b}\|^2. \tag{27}$$

Any norm is easily shown to be convex (see e.g. [52] for real vector spaces), and since a norm is nonnegative its square is also convex and the lemma follows. ■
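The convexity statement can be checked numerically on a small example. The sketch below is illustrative only: it assumes the form $F_e(\rho,\mathcal{E}) = \sum_i |\mathrm{tr}\,A_i\rho|^2$, and the amplitude-damping operators, the damping probability `g` and the two input states are assumptions chosen for the demonstration.

```python
import math

def entanglement_fidelity(ops, rho):
    """Sum over operators of |tr(A_i rho)|^2."""
    n = len(rho)
    return sum(
        abs(sum(op[i][k] * rho[k][i] for i in range(n) for k in range(n))) ** 2
        for op in ops
    )

def mix(lam, rho1, rho2):
    """Convex combination lam * rho1 + (1 - lam) * rho2."""
    return [[lam * x + (1 - lam) * y for x, y in zip(r1, r2)]
            for r1, r2 in zip(rho1, rho2)]

g = 0.3  # illustrative damping probability (assumption)
ops = [[[1, 0], [0, math.sqrt(1 - g)]],
       [[0, math.sqrt(g)], [0, 0]]]
rho1 = [[1, 0], [0, 0]]
rho2 = [[0, 0], [0, 1]]

gaps = []
for step in range(11):
    lam = step / 10
    lhs = entanglement_fidelity(ops, mix(lam, rho1, rho2))
    rhs = (lam * entanglement_fidelity(ops, rho1)
           + (1 - lam) * entanglement_fidelity(ops, rho2))
    gaps.append(rhs - lhs)  # nonnegative if the convexity inequality holds
print(min(gaps), max(gaps))
```

In this example only the first operator has nonzero trace against either input, so the gap reduces to $\lambda(1-\lambda)(a-b)^2$ with $a = 1$ and $b = \sqrt{1-g}$, which is the usual convexity gap of a square.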

Note that with this representation of the entanglement fidelity, the freedom to choose an environment basis (equivalently, the freedom to move to a different operator decomposition of a given operation) corresponds to performing a maximal partial isometry $V$ from the complex vector space containing $\mathbf{a}$ to another complex vector space (with dimension equal to the number of operators in the new decomposition). (Since the transformation is length-preserving, it preserves (as it had better!) the entanglement fidelity.) We may use this "unitary" freedom to transform the vector $\mathbf{a}$ into one of the same length with only a particular component, say the first, nonzero. Then the entanglement fidelity will just be the squared modulus $|\mathrm{tr}\, A_1\rho|^2$ of that component. This gives us a useful lemma:

Lemma 2: There exists an operator sum decomposition $\{A_i\}$ of $\mathcal{A}$ such that $F_e(\rho,\mathcal{A}) = F_e(\rho, A_1)$.

It may be instructive to see how this result arises in the $RQE$ or unitary view of operations. The entanglement fidelity is the fidelity of $\rho^{RQ'}$ and the initial state of $RQ$: this is equal to the squared inner product of $|0^E\rangle|\psi^{RQ}\rangle$ with some purification of $\rho^{RQ'}$. The final pure state of $RQE$ is such a purification, so it is related to the one whose inner product with the initial state gives the fidelity by a unitary on the environment; view this inner product as one between the final state of $RQE$ and some other tensor product state $W^E|0^E\rangle|\psi^{RQ}\rangle$, and the result follows (since the individual terms in the entanglement fidelity correspond to particular states in an orthonormal basis used for the trace over the environment).

A slight variant of this interpretation is useful in the proof of the next lemma. Write the entanglement fidelity as

$$\begin{aligned} &\mathrm{tr}_{RQ}\left(\rho^{RQ'}\,|\psi^{RQ}\rangle\langle\psi^{RQ}|\right) \\ &\quad= \mathrm{tr}_{RQE}\left(U^{QE}|0^E\rangle|\psi^{RQ}\rangle\langle\psi^{RQ}|\langle 0^E|U^{QE\dagger}\,\left(|\psi^{RQ}\rangle\langle\psi^{RQ}| \otimes I^E\right)\right) \\ &\quad= \left\|\left(|\psi^{RQ}\rangle\langle\psi^{RQ}| \otimes I^E\right)U^{QE}|\psi^{RQ}\rangle|0^E\rangle\right\|^2. \end{aligned} \tag{28}$$

That is, the entanglement fidelity is just the squared length of the projection $|\pi^{RQE}\rangle$ of the evolved pure state of $RQE$ onto the tensor product of the environment and the one-dimensional subspace of $RQ$ spanned by the initial state of $RQ$. The vector $\mathbf{a}$ above is in fact just this projection; the components of $\mathbf{a}$ give the individual terms in the entanglement fidelity in a particular operator decomposition, i.e. they are the components of the vector $\mathbf{a}$ in a particular orthonormal basis. These correspond to the components of the projection $|\pi^{RQE}\rangle$ in a particular orthonormal basis $|\chi_i^E\rangle|\psi^{RQ}\rangle$ for the subspace onto which we have projected, which corresponds to a choice of orthonormal basis $|\chi_i^E\rangle$ for the environment. So the lemma above is nothing but the observation that if we do the trace (in the definition of the entanglement fidelity) in an environment basis the first vector of which is a normalized version of $|\pi^{RQE}\rangle$, we only get one term, which is of course the squared length of this projection.

We use this point of view to derive a lemma which con- 
cerns applying operations in sequence: if an operation has 
high fidelity, then the fidelity of the operation consisting of 
that operation followed by a second operation, is close to 
the fidelity of the second operation alone. 

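Lemma 3: If $F_e(\rho,\mathcal{E}) > 1 - \eta$ then for trace-nonincreasing $\mathcal{A}$,

$$|F_e(\rho,\mathcal{A}\mathcal{E}) - F_e(\rho,\mathcal{A})| \le 2\eta. \tag{29}$$

Proof: Let $E_1$ and $E_2$ be environments inducing the operations $\mathcal{E}$ and $\mathcal{A}$ through unitary interactions $U^{QE_1}$ and $V^{QE_2}$ respectively. Then:

$$\begin{aligned} 1 - \eta &< F_e(\rho,\mathcal{E}) \\ &= \left\|\left(\langle\psi^{RQ}| \otimes I^{E_1}\right)U^{QE_1}|\psi^{RQ}\rangle|0^{E_1}\rangle\right\|^2 \\ &= \left|\langle\psi^{RQ}|\langle 0^{E_2}|\langle\chi^{E_1}|\,U^{QE_1}\,|0^{E_1}\rangle|0^{E_2}\rangle|\psi^{RQ}\rangle\right|^2 \end{aligned} \tag{30}$$

for some $|\chi^{E_1}\rangle$. That is, the two vectors $U^{QE_1}|0^{E_1}\rangle|0^{E_2}\rangle|\psi^{RQ}\rangle$ and $|\chi^{E_1}\rangle|0^{E_2}\rangle|\psi^{RQ}\rangle$ are close. Now consider the two fidelities the magnitude of whose difference we wish to bound; these may be written as the squared lengths of projections of the two close vectors just considered. That is, define

$$P = |\psi^{RQ}\rangle\langle\psi^{RQ}| \otimes I^{E_1E_2}. \tag{31}$$

Then

$$\begin{aligned} F_e(\rho,\mathcal{A}\mathcal{E}) &= \left\|P\,V^{QE_2}U^{QE_1}|\psi^{RQ}\rangle|0^{E_1}\rangle|0^{E_2}\rangle\right\|^2 \\ &= \left\|\left(V^{QE_2\dagger}PV^{QE_2}\right)U^{QE_1}|\psi^{RQ}\rangle|0^{E_1}\rangle|0^{E_2}\rangle\right\|^2 \end{aligned} \tag{32}$$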



SUBMITTED TO IEEE TRANSACTIONS ON INFORMATION THEORY 



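and

$$\begin{aligned} F_e(\rho,\mathcal{A}) &= \left\|P\,V^{QE_2}|\psi^{RQ}\rangle|\chi^{E_1}\rangle|0^{E_2}\rangle\right\|^2 \\ &= \left\|\left(V^{QE_2\dagger}PV^{QE_2}\right)|\psi^{RQ}\rangle|\chi^{E_1}\rangle|0^{E_2}\rangle\right\|^2. \end{aligned} \tag{33}$$

From elementary geometry, if for normalized $|1\rangle$ and $|2\rangle$, $|\langle 1|2\rangle|^2 = 1 - \eta$, then for any projector $P$, $|\langle 1|P|1\rangle - \langle 2|P|2\rangle| \le 2\eta$. This may be applied directly to obtain the lemma. ■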

A very simple but useful lemma implies that if two operations have high entanglement fidelity on the same density operator, the final density operators have high fidelity with each other. The notion of fidelity used here is treated in [...]. It may be defined by

$$F(\rho_1,\rho_2) = \max |\langle\psi_1|\psi_2\rangle|^2, \tag{34}$$

where $|\psi_i\rangle$ are purifications of $\rho_i$.

In terms of this fidelity, the lemma is:

Lemma 4: If $\mathcal{A}, \mathcal{B}$ are trace-preserving and $F_e(\rho,\mathcal{A}) > 1 - \epsilon_1$ and $F_e(\rho,\mathcal{B}) > 1 - \epsilon_2$ then $F(\mathcal{A}(\rho), \mathcal{B}(\rho)) > 1 - \epsilon_1 - \epsilon_2$.

Proof: Note that if $\langle 1|1\rangle = \langle 2|2\rangle = 1$, $|\langle 1|2\rangle|^2 > 1 - \epsilon_1$ and $|\langle 1|3\rangle|^2 > 1 - \epsilon_2$, then $|\langle 2|3\rangle|^2 > 1 - \epsilon_1 - \epsilon_2$. Apply this with $|2\rangle$ and $|3\rangle$ being the purifications of $\mathcal{A}(\rho)$ and $\mathcal{B}(\rho)$ whose squared inner products with a purification $|1\rangle$ of $\rho$ give the entanglement fidelities, obtaining

$$|\langle 2|3\rangle|^2 > 1 - \epsilon_1 - \epsilon_2. \tag{35}$$

Since the fidelity is the maximum squared inner product of purifications, $F(\mathcal{A}(\rho), \mathcal{B}(\rho)) \ge |\langle 2|3\rangle|^2 > 1 - \epsilon_1 - \epsilon_2$, as claimed. ■

B. Continuity of entanglement fidelity in the input operator

We will also need the continuity lemma for entanglement fidelity, trivially extended to the case of unnormalized $F_e$ from [...].

Lemma 5:

$$|F_e(B + \Delta, \mathcal{A}) - F_e(B, \mathcal{A})| \le (\mathrm{tr}\,|\Delta|)^2 + 2\,\mathrm{tr}\,|\Delta|, \tag{36}$$

where $|\Delta| = \sqrt{\Delta^\dagger\Delta}$.

C. Continuity of entropies in fidelities 

Here we will establish a quantitative statement of the 
continuity of the entropy as a function of the density oper- 
ator, in terms of the fidelity of neighboring density opera- 
tors. 

Lemma 6: For any density operators $\rho_1$, $\rho_2$ acting on a $d$-dimensional Hilbert space,

$$|S(\rho_1) - S(\rho_2)| \le 2\sqrt{1 - F(\rho_1,\rho_2)}\,\log d + 1 \tag{37}$$

when

$$2\sqrt{1 - F(\rho_1,\rho_2)} < \frac{1}{3}. \tag{38}$$



Proof: The proof begins with an inequality due to Fannes [36], involving an "error" quantity different from $1 - F(\rho_1,\rho_2)$. Defining the $L_1$ norm of an operator $A$ as

$$\|A\| = \mathrm{tr}\,|A| = \mathrm{tr}\sqrt{A^\dagger A}, \tag{39}$$

and the function $\eta(\cdot)$ by $\eta(x) = -x\log x$, we have (when $\|\rho_1 - \rho_2\| < \frac{1}{3}$)

$$|S(\rho_1) - S(\rho_2)| \le \|\rho_1 - \rho_2\|\log d + \eta(\|\rho_1 - \rho_2\|). \tag{40}$$

For our purposes, we may note that for $x < \frac{1}{3}$, $\eta(x) < 1$, and use the weaker inequality

$$|S(\rho_1) - S(\rho_2)| \le \|\rho_1 - \rho_2\|\log d + 1. \tag{41}$$

Defining $p^{(1)}, p^{(2)}$ to be the probability distributions given by the eigenvalues of $\rho_1$ and $\rho_2$ respectively, we note that if the two density matrices commute, then $\|\rho_1 - \rho_2\| = 2d_K(p^{(1)}, p^{(2)})$, where $d_K$ is the Kolmogorov distance or total variation distance between two probability distributions,

$$d_K(p^{(1)}, p^{(2)}) = \frac{1}{2}\sum_i \left|p_i^{(1)} - p_i^{(2)}\right|. \tag{42}$$

Since the entropy difference is invariant under independent unitary rotations of each density matrix,

$$|S(\rho_1) - S(\rho_2)| \le 2d_K(p^{(1)}, p^{(2)})\log d + 1, \tag{43}$$

where we may take the eigenvalues to be arranged in order of size in both probability distributions. An inequality of C. H. Kraft [...] implies

$$d_K(p^{(1)}, p^{(2)}) \le \sqrt{1 - B(p^{(1)}, p^{(2)})}, \tag{44}$$

where $B$ is the Bhattacharyya-Wootters overlap

$$B(p^{(1)}, p^{(2)}) = \left(\sum_i \sqrt{p_i^{(1)} p_i^{(2)}}\right)^2. \tag{45}$$

Moreover,

$$B(p^{(1)}, p^{(2)}) \ge F(\rho_1,\rho_2), \tag{46}$$

since, given the eigenvalues of both density operators, the fidelity is maximized by choosing their eigenvectors to be the same, assigned to eigenvalues in order of size. This follows easily from [...] and the representation of the square root of the fidelity as

$$\max_{\text{unitary } U} \left|\mathrm{tr}\,\rho_1^{1/2}\rho_2^{1/2}U\right|. \tag{47}$$

This completes the proof of the lemma. ■
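The classical chain of inequalities used in the proof can be illustrated with two nearby probability distributions, which correspond to commuting density operators with identically ordered eigenvalues (for which the fidelity coincides with the overlap). The sketch below is a minimal check under assumptions: the overlap is taken to be the squared Bhattacharyya coefficient $\left(\sum_i \sqrt{p_i q_i}\right)^2$, logarithms are base 2, and the two distributions are arbitrary illustrative choices.

```python
import math

def kolmogorov_distance(p, q):
    """Half the L1 distance between two probability distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def overlap(p, q):
    """Squared Bhattacharyya coefficient (assumed form of the overlap B)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q)) ** 2

def entropy(p):
    """Shannon entropy in bits."""
    return -sum(a * math.log2(a) for a in p if a > 0)

p = [0.5, 0.3, 0.2]     # illustrative eigenvalues of rho_1
q = [0.49, 0.31, 0.2]   # illustrative eigenvalues of rho_2
d = len(p)

d_k = kolmogorov_distance(p, q)
b = overlap(p, q)
radius = math.sqrt(1 - b)                  # bound on d_k from the overlap
entropy_gap = abs(entropy(p) - entropy(q))
bound = 2 * radius * math.log2(d) + 1      # right side of the entropy bound
print(d_k, radius, entropy_gap, bound)
```

The additive constant makes the entropy bound very loose for such close distributions; its value lies in being uniform in the dimension $d$ apart from the $\log d$ factor.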
Now consider the situation where $d$ is the dimension of each of two spaces $Q$ and $R$, and $\rho^{RQ}$ a density operator on the $d^2$-dimensional space $R \otimes Q$. We use the notation $\rho^R = \mathrm{tr}_Q\,\rho^{RQ}$. Using Lemma 6 and the fact that $F(\rho_1^R, \rho_2^R) \ge F(\rho_1^{RQ}, \rho_2^{RQ})$, one easily obtains a continuity relation for the entropy of $Q$ conditional on $R$, defined as

$$S(Q|R) = S(\rho^{RQ}) - S(\rho^R). \tag{48}$$

Lemma 7: Continuity of conditional entropy.

$$|S(Q_1|R_1) - S(Q_2|R_2)| \le 6\sqrt{1 - F(\rho_1^{RQ}, \rho_2^{RQ})}\,\log d + 2$$

when $F(\rho_1^{RQ}, \rho_2^{RQ}) > 5/9$.

This lemma will be useful in the discussion of capacity below, because the capacity is bounded by a quantity, the coherent information of a density operator $\rho^Q$ under an operation $\mathcal{E}$, which may be written in terms of a conditional entropy. This quantity is defined by

$$I_c(\rho^Q, \mathcal{E}) = S\!\left(\frac{\mathcal{E}(\rho^Q)}{\mathrm{tr}\,\mathcal{E}(\rho^Q)}\right) - S\!\left(\frac{(\mathcal{I} \otimes \mathcal{E})\left(|\psi^{RQ}\rangle\langle\psi^{RQ}|\right)}{\mathrm{tr}\,\mathcal{E}(\rho^Q)}\right). \tag{49}$$

IV. The typical subspace and entanglement fidelity

We now derive some interesting implications of the QAEP for entanglement fidelity. These may be summarized by the statement that in order for the entanglement fidelity to be asymptotically high, it is necessary and sufficient that the fidelity be high on the typical subspace. We will demonstrate two versions of this statement, both of which will be used later on.

Define a quantum data compression scheme for a source $\Sigma = (H_s, \rho_s^{(n)})$ to be a sequence of trace-preserving quantum operations $\mathcal{C}^{(n)}$ from $H_s^{\otimes n}$ to the $\epsilon$-typical subspace $T_\epsilon^{(n)}$ of the source such that

$$\mathcal{C}^{(n)}(\rho^{(n)}) = \mathcal{C}_1^{(n)}(\rho^{(n)}) + \mathcal{C}_2^{(n)}(\rho^{(n)}), \tag{50}$$

where $\mathcal{C}_1^{(n)}(\rho^{(n)}) = \Lambda_n\rho^{(n)}\Lambda_n$, $\Lambda_n$ is the projector onto $T_\epsilon^{(n)}$, and $\mathrm{tr}(\Lambda_n\rho^{(n)}) = 1 - \delta_n$. Then we may derive the following lemma.

Lemma 8: For any source satisfying the QAEP, any quantum data compression scheme and any trace-preserving operation $\mathcal{A}^{(n)}$ from $H_s^{\otimes n}$ to $H_s^{\otimes n}$,

$$\left|F_e(\rho^{(n)}, \mathcal{A}^{(n)}\mathcal{C}^{(n)}) - F_e(\rho^{(n)}, \mathcal{A}^{(n)})\right| \le 2\delta_n. \tag{51}$$

This lemma is an immediate consequence of Lemma 3.

In applying this lemma, we have in mind a situation where $\mathcal{A}^{(n)}$ represents the effect of further encoding taking us from the source Hilbert space $H_s^{\otimes n}$ to the channel Hilbert space $H_c^{\otimes n}$, followed by the channel noise operation, and a decoding which takes us back to the source Hilbert space. By the QAEP, for large $n$ the weight $\delta_n = 1 - \mathrm{tr}\,\Lambda_n\rho^{(n)}\Lambda_n$ outside the typical subspace becomes smaller than any predetermined positive $\delta$; hence the difference between the entanglement fidelity when the encoding is preceded by quantum data compression of the source, and the entanglement fidelity without such a step, is asymptotically negligible.

For some purposes, it will be more useful to compare the entanglement fidelity of a source with the entanglement fidelity of the renormalized projection of the source onto its typical subspace. Let $\Lambda$ be the projector onto the typical subspace after $n$ uses of the source, and $\bar{\Lambda}$ the projector onto the orthogonal subspace. For any positive $\epsilon$ and large enough $n$,

$$\mathrm{tr}(\bar{\Lambda}\rho^{(n)}\bar{\Lambda}) < \epsilon. \tag{52}$$

Defining the renormalized restriction of the source to the typical subspace,

$$\rho_\epsilon^{(n)} = \frac{\Lambda\rho^{(n)}\Lambda}{\mathrm{tr}(\Lambda\rho^{(n)}\Lambda)}, \tag{53}$$

and applying the continuity lemma for entanglement fidelity, (36), we have the following lemma:

Lemma 9: For any trace-preserving operation $\mathcal{E}$ and any source satisfying the QAEP,

$$\left|F_e(\rho^{(n)}, \mathcal{E}) - F_e(\rho_\epsilon^{(n)}, \mathcal{E})\right| \le \frac{4\epsilon}{(1-\epsilon)^2}. \tag{54}$$

By choosing $n$ sufficiently large, $\epsilon$ can be made arbitrarily small, and thus we see that for the entanglement fidelity for the source to be high asymptotically, it is necessary and sufficient that the entanglement fidelity be high asymptotically for the renormalized restriction of the source to the typical subspace.

V. Entanglement fidelity and minimum pure state fidelity

A. Entanglement transmission implies pure-state trans- 
mission 

We will first show that if a source satisfying the QAEP 
can be transmitted over a channel with entanglement fi- 
delity approaching one in the large-block limit, one can 
transmit a subspace which is asymptotically of dimension 
$2^{nS(\Sigma)}$ with minimum pure state fidelity approaching one.
That is, if a channel can send entanglement at a certain 
rate, it can send subspaces with high pure-state fidelity at 
that rate also. 

The argument has parallels with the classical argument 
that if one can transmit with low expected error (taking 
the expectation over messages), one can transmit with low 
maximal error. That argument proceeds by throwing out 
the highest-error half of the codewords, and then estab- 
lishing a definite bound on the maximum error of the re- 
maining codewords in terms of the average error of the 
initial ensemble of codewords. Both quantities go to zero 
together. Throwing out half the codewords reduces the 
rate by a bit, but asymptotically this is negligible. Here, 
we throw out a low-fidelity fraction of the Hilbert space 
dimensions, in a certain systematic way which enables us 
to bound the minimum fidelity of the remaining states in 
terms of the entanglement fidelity. 

We do not expect to be able to show that a "logarith- 
mically large" subspace of the support of an arbitrary den- 
sity operator may be sent with high minimum pure-state 
fidelity. That would mean that the capacity for sending 
subspaces with high minimum fidelity would be higher than 
the capacity for sending entanglement, since the dimension 
of the support of a density matrix is typically much higher
than its entropy. After all, many of the dimensions in the 
support of a density matrix may have negligible probability, 
and hence the failure to send them would be expected to 
have negligible impact on the entanglement fidelity. There- 
fore, a high entanglement fidelity would not necessarily 
suggest that all dimensions in the support, or even a log- 
arithmically large subset of them, can be sent accurately. 
Rather, we expect to be able to show that a subspace whose 
dimension is a "logarithmically large" fraction of $2^{nS(\Sigma)}$ can
be sent with high minimum pure-state fidelity. In fact, we 
expect that a logarithmically large subspace of the typical 
subspace can be sent with high pure-state fidelity. 

Our approach, then, will be to use Lemma 9 to argue that if a source $\Sigma$ generating density operators $\rho^{(n)}$ can be sent with asymptotically high entanglement fidelity, so can $\rho_\epsilon^{(n)}$, the renormalized restriction of $\rho^{(n)}$ to its $\epsilon$-typical subspace. Therefore, for large enough $n$, $\rho_\epsilon^{(n)}$ can be made to have entanglement fidelity greater than $1 - \eta$ for any positive $\eta$. We will indicate, without explicitly changing notation from $\rho$ to $\rho_\epsilon^{(n)}$, where we first use properties of $\rho_\epsilon^{(n)}$.

Suppose a density operator p with if-dimensional sup- 
port can be sent with entanglement fidelity 1 — rj. Consider 
the following procedure for systematically removing dimen- 
sions from the support of the density operator. Let |1) be 
the lowest fidelity pure state in the support. We then define 
the (sub-normalized) positive operator pi by 



Now 



pi = p-gi|l)(l| 



(55) 



where qi is the largest positive q\ for which p is still a 
positive operator. We continue this process recursively 
defining po = p, and 



- q l \i){i\ , 



(56) 



where \i) is the state in the support of pi-i with the lowest 
pure-state fidelity, and qi is as large as it can be subject to 
the constraint that p.; is a positive operator. 

The vectors in this set are ordered in terms of increasing pure-state fidelity; we will write $f_i$ for the pure-state fidelity $\langle i|\mathcal{E}(|i\rangle\langle i|)|i\rangle$ of $|i\rangle$.

Note that $\mathrm{tr}\,\rho_1 = 1 - q_1$, and in general $\mathrm{tr}\,\rho_j = \mathrm{tr}\,\rho_{j-1} - q_j = 1 - \sum_{i=1}^{j} q_i$. By construction, $\mathrm{rank}(\rho_i) = \mathrm{rank}(\rho_{i-1}) - 1$. Hence $\rho_K = 0$ and $\sum_{i=1}^{K} q_i = 1$. Furthermore

$$\rho = \sum_{i=1}^{K} q_i |i\rangle\langle i| , \tag{57}$$

that is, $\{q_i, |i\rangle\}$ are a pure-state ensemble for $\rho$. Note that while this procedure removes dimensions from the support of the density matrix one by one, the dimensions it removes are not necessarily the one-dimensional spaces spanned by the vectors $|i\rangle$. Indeed, the vectors $|i\rangle$ will usually not be an orthonormal basis for the support of $\rho$, although they are linearly independent.

Now

$$\langle i|\rho|i\rangle = q_i + \sum_{j \neq i} q_j |\langle i|j\rangle|^2 . \tag{58}$$

Since the terms in the sum are all nonnegative,

$$q_i \le \langle i|\rho|i\rangle \le \lambda_1(\rho) , \tag{59}$$

where $\lambda_1(\rho)$ is the largest eigenvalue of $\rho$. That is, any upper bound on the eigenvalues of $\rho$ is also an upper bound on the $q_i$.
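The peeling step of this procedure can be sketched numerically. The snippet below is a minimal illustration, not the authors' construction: the function name `peel_ensemble` is hypothetical, and because finding the lowest-fidelity state requires a specific channel, the vector removed at each step is a placeholder (an equal superposition of the current support eigenvectors). It uses the standard fact that, for a unit vector $|v\rangle$ in the support, the largest $q$ keeping $\rho - q|v\rangle\langle v|$ positive is $1/\langle v|\rho^{-1}|v\rangle$, with the inverse taken on the support.

```python
import numpy as np

def peel_ensemble(rho, tol=1e-9):
    """Return weights q_i and unit vectors |i> with rho = sum_i q_i |i><i|.

    Assumption: the vector removed at each step is a placeholder choice,
    not the lowest-fidelity state used in the text.
    """
    rho = np.array(rho, dtype=complex)
    rank = int(np.sum(np.linalg.eigvalsh(rho) > tol))
    weights, vectors = [], []
    for _ in range(rank):
        evals, evecs = np.linalg.eigh(rho)
        keep = evals > tol
        support = evecs[:, keep]
        # Placeholder |i>: equal superposition of the support eigenvectors.
        vec = support.sum(axis=1)
        vec = vec / np.linalg.norm(vec)
        # Inverse of rho restricted to its support.
        inv = (support / evals[keep]) @ support.conj().T
        # Largest q keeping rho - q|i><i| positive is 1 / <i|rho^{-1}|i>.
        q = 1.0 / np.real(vec.conj() @ inv @ vec)
        rho = rho - q * np.outer(vec, vec.conj())
        weights.append(q)
        vectors.append(vec)
    return np.array(weights), vectors

rho = np.diag([0.5, 0.3, 0.2])
q, vecs = peel_ensemble(rho)
print("weights:", q, "sum:", q.sum())
```

Each pass lowers the rank by one, the weights sum to one, and every weight stays below the largest eigenvalue of the input, in line with (57) and (59).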

In particular, when $\rho = \rho_\epsilon^{(n)}$ then for large enough $n$ the $q_i$ satisfy the bounds on eigenvalues from the QAEP:

$$2^{-n(S(\Sigma)+\epsilon)} \le q_i \le \frac{2^{-n(S(\Sigma)-\epsilon)}}{1-\delta} . \tag{60}$$



Now, by the convexity of entanglement fidelity in the density operator,

$$\sum_{i=1}^{n_0} q_i f_i + \Big(\sum_{i=n_0+1}^{K} q_i\Big) F_e(\tilde\rho_{n_0+1}) \ge F_e(\rho) = 1 - \eta , \tag{61}$$

where $\tilde\rho_{n_0+1}$ is the normalized version of $\rho_{n_0+1}$, i.e., the density operator with the lowest-fidelity $n_0$ dimensions of its support removed. Define $\alpha = \sum_{i=1}^{n_0} q_i$. Thus we are considering the situation where we throw out $n_0$ of the states, leaving a fraction $(1 - \alpha)$ of the total weight of the density operator.

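We will denote by $1 - \gamma$ the pure-state fidelity of $|n_0 + 1\rangle$,

$$f_{n_0+1} = 1 - \gamma ; \tag{62}$$

this is the lowest pure-state fidelity of any of the remaining vectors $|i\rangle$ for $i > n_0$, and by construction also the lowest pure-state fidelity of any state in the subspace they span. Then, since $f_i \le 1 - \gamma$ for $i \le n_0$ and the entanglement fidelity is at most one, (61) gives

$$(1 - \gamma)\alpha + (1 - \alpha) \ge 1 - \eta , \tag{63}$$

so that

$$\gamma \le \frac{\eta}{\alpha} . \tag{64}$$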



Thus the reciprocal of $\alpha$ is the factor by which error is increased when the first $n_0$ dimensions are removed from the support of $\rho$ by the above procedure. Since

$$q_i \ge 2^{-n(S(\Sigma)+\epsilon)} , \tag{65}$$

we have

$$n_0\, 2^{-n(S(\Sigma)+\epsilon)} \le \alpha . \tag{66}$$

Thus, for a fixed $\alpha$, our procedure leaves us with a subspace having dimensionality $D = K - n_0$ of pure states which can be sent with fidelity at least $1 - \eta/\alpha$. Now,

$$\sum_{i=n_0+1}^{K} q_i = 1 - \alpha , \tag{67}$$

so

$$D \ge (1 - \alpha)\, 2^{n(S(\Sigma)-\epsilon)} \tag{68}$$

and the rate

$$\frac{\log D}{n} \ge \frac{\log(1-\alpha)}{n} + S(\Sigma) - \epsilon . \tag{69}$$

That is, for any fixed $\eta$ and $\alpha$ strictly between zero and one, for large enough $n$ the size in qubits of a subspace with minimum fidelity $1 - \eta/\alpha$ approaches $n(S(\Sigma) - \epsilon)$. Hence all rates less than $S(\Sigma)$ may be achieved. Since $\Sigma$ was any density operator source that could be sent with high entanglement fidelity, this implies that any rate less than the capacity for sending entanglement may be achieved for sending subspaces with high minimum pure-state fidelity. Thus we have

Theorem 1: $Q_s \ge Q_e$.
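The way the rate bound (69) approaches $S(\Sigma) - \epsilon$ as the block size grows can be seen with a few lines of arithmetic. This is only an illustrative sketch: the function name `rate_lower_bound` and the sample values of $\alpha$, $S(\Sigma)$ and $\epsilon$ are assumptions chosen for the example, and the logarithm is taken base 2 so that the rate is in qubits per channel use.

```python
import math

def rate_lower_bound(n, alpha, entropy, eps):
    """Right-hand side of (69): log2(1 - alpha)/n + S - eps (qubits per use)."""
    return math.log2(1.0 - alpha) / n + entropy - eps

# Illustrative values only: alpha = 1/2, S = 1 qubit, eps = 0.01.
for n in (10, 100, 1000, 10000):
    print(n, rate_lower_bound(n, alpha=0.5, entropy=1.0, eps=0.01))
```

The penalty term $\log(1-\alpha)/n$ vanishes as $n$ grows, so keeping only a fixed fraction of the weight costs nothing in rate asymptotically.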

B. Pure state transmission implies entanglement transmis- 
sion 

We now show that the entanglement fidelity of a den- 
sity operator under an operation cannot be too much less 
than the minimum pure-state fidelity of states in the den- 
sity operator's support. As minimum pure-state fidelity 
approaches one, so does entanglement fidelity, so that any 
density operator with support entirely in this subspace can 
be sent with high entanglement fidelity. Specifically, we prove the following theorem (see also [?]). The argument makes no use of the notion of typical subspace, and hence is not limited to sources satisfying the QAEP.

Theorem 2: Suppose all pure states $|\psi\rangle$ in a subspace $S$ have pure-state fidelity $\langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle$ greater than or equal to $1 - \eta$. Then any density operator $\rho$ whose support lies entirely in that subspace has entanglement fidelity $F_e(\rho, \mathcal{E}) \ge 1 - \frac{3}{2}\eta$.

For applications to asymptotic channel capacity what is important is that the error for sending entanglement goes to zero if the maximum error for density operators in the subspace does, and that the relationship between the two fidelities involves no factors of the dimension of Hilbert space, which could cause trouble in taking the large block limit. This means that if we can transmit Hilbert-space dimensions with minimum fidelity approaching one at a rate $C$, we can also reliably transmit the entanglement of any source $\Sigma$ with entropy $S(\Sigma) < C$.

Proof: We Schmidt decompose $|\Psi^{RQ}\rangle$:

$$|\Psi^{RQ}\rangle = \sum_k \sqrt{\lambda_k}\, |k^R\rangle |k^Q\rangle . \tag{70}$$

In the Schmidt decomposition $|k^R\rangle$ and $|k^Q\rangle$ are the diagonal bases of the density operators $\rho^R$ and $\rho^Q$, labeled according to their common eigenvalues $\lambda_k$. Then

$$\rho^{RQ'} = (\mathcal{I} \otimes \mathcal{E})(|\Psi\rangle\langle\Psi|) = \sum_{kl} \sqrt{\lambda_k \lambda_l}\, |k^R\rangle\langle l^R| \otimes \mathcal{E}(|k^Q\rangle\langle l^Q|) . \tag{71}$$

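The entanglement fidelity becomes (omitting the superscripts $R$ and $Q$ to reduce clutter):

$$F_e(\rho, \mathcal{E}) = \sum_{mnkl} \sqrt{\lambda_m \lambda_n \lambda_k \lambda_l}\, \langle m|k\rangle \langle l|n\rangle \langle m|\mathcal{E}(|k\rangle\langle l|)|n\rangle = \sum_{kl} \lambda_k \lambda_l \langle k|\mathcal{E}(|k\rangle\langle l|)|l\rangle . \tag{72}$$

A first attempt at a proof might split up the sum as:

$$F_e = \sum_k \lambda_k^2 \langle k|\mathcal{E}(|k\rangle\langle k|)|k\rangle + \sum_{k \neq l} \lambda_k \lambda_l \langle k|\mathcal{E}(|k\rangle\langle l|)|l\rangle . \tag{73}$$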

We see that the first sum here can certainly be bounded below using the fact that pure-state fidelities for vectors in the basis $|k\rangle$ are greater than $1 - \eta$, but the second term has cross-terms that are more difficult to deal with. The proof will have to use the fact that not only vectors in the basis $|k\rangle$, but arbitrary superpositions of them, have high fidelity, and the pure-state fidelities of these superpositions will contain such cross-terms. Since the expressions we want to bound contain the probabilities $\lambda_k$, we will consider superpositions with amplitudes $\sqrt{\lambda_k}$ and all possible phase factors $e^{i\phi_k}$:

$$|\psi(\phi_1, \ldots, \phi_K)\rangle = \sum_k \sqrt{\lambda_k}\, e^{i\phi_k} |k\rangle . \tag{74}$$

The pure-state fidelity for this is:

$$\langle\psi|\mathcal{E}(|\psi\rangle\langle\psi|)|\psi\rangle = \sum_{mnkl} \sqrt{\lambda_m \lambda_n \lambda_k \lambda_l}\, \langle m|\mathcal{E}(|k\rangle\langle l|)|n\rangle\, e^{i(\phi_k + \phi_n - \phi_l - \phi_m)} . \tag{75}$$

The $m = k$, $n = l$ terms will give the entanglement fidelity in the form (72) (since the phases appear in complex conjugate pairs in those terms, they disappear). But there are other terms in (75) which we need to argue are small, or somehow get rid of, in order to argue that the high fidelity of these pure states implies that the terms constituting the entanglement fidelity are high. We do this by averaging the entanglement fidelity for these superpositions over all phases from zero to $2\pi$. We still get the desired terms, but many of the cross terms will disappear. Only those with four indices identical, or with indices identical in complex conjugate pairs, will remain; the rest will contain integrals like $\int_0^{2\pi} d\phi_k\, e^{i\phi_k}$ (from indices whose value is not equal to that of some other index), or $\int_0^{2\pi} d\phi_k\, e^{2i\phi_k}$ (from pairs of identical indices that are not complex conjugates). The average is of course still greater than $1 - \eta$; and the remaining terms are:

$$\bar{f} = \sum_{kl} \lambda_k \lambda_l \langle k|\mathcal{E}(|k\rangle\langle l|)|l\rangle + \sum_{k,m;\, k \neq m} \lambda_k \lambda_m \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle = F_e + \sum_{k,m;\, k \neq m} \lambda_k \lambda_m \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle \ge 1 - \eta . \tag{76}$$
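The phase-averaging identity can be checked numerically on a small example. The sketch below is illustrative only: the helper names (`apply_channel`, `phase_averaged_fidelity`, `rhs_of_76`) and the test channel (a mixture of the identity and a Hadamard rotation, given by Kraus operators) are assumptions made for the example. It averages over the four phases $\pm 1, \pm i$ per basis vector rather than integrating, and takes $\rho$ diagonal in the computational basis with eigenvalues `lam`.

```python
import itertools
import numpy as np

def apply_channel(kraus, op):
    """Apply the operation with Kraus operators `kraus` to the operator `op`."""
    return sum(A @ op @ A.conj().T for A in kraus)

def phase_averaged_fidelity(kraus, lam):
    """Average pure-state fidelity of sum_k sqrt(lam_k) e^{i phi_k}|k>
    over phases e^{i phi_k} in {1, i, -1, -i}."""
    d = len(lam)
    total = 0.0
    for phases in itertools.product([1, 1j, -1, -1j], repeat=d):
        psi = np.sqrt(lam) * np.array(phases)
        out = apply_channel(kraus, np.outer(psi, psi.conj()))
        total += np.real(psi.conj() @ out @ psi)
    return total / 4 ** d

def rhs_of_76(kraus, lam):
    """Entanglement fidelity plus the surviving cross terms in (76)."""
    d = len(lam)
    rho = np.diag(lam)
    f_e = sum(abs(np.trace(rho @ A)) ** 2 for A in kraus)
    basis = np.eye(d)
    cross = 0.0
    for k in range(d):
        out_k = apply_channel(kraus, np.outer(basis[k], basis[k]))
        for m in range(d):
            if m != k:
                cross += lam[k] * lam[m] * np.real(out_k[m, m])
    return f_e + cross

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
kraus = [np.sqrt(0.8) * np.eye(2), np.sqrt(0.2) * hadamard]
lam = np.array([0.7, 0.3])
print(phase_averaged_fidelity(kraus, lam), rhs_of_76(kraus, lam))
```

The two printed numbers agree up to rounding, which is the content of (76): after averaging, only the entanglement fidelity and the "wrong-output" terms $\lambda_k\lambda_m\langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle$ remain.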



(The same result obtains if the average is done not by integration over all possible values of the complex phase factor, but only over the phases $\pm 1$, $\pm i$.) We need to upper bound the terms that do not appear in the entanglement fidelity. These terms all contain the fidelity of the output state $\mathcal{E}(|k\rangle\langle k|)$ to a state $|m\rangle$ orthogonal to the input state $|k\rangle$. Since $\mathcal{E}(|k\rangle\langle k|)$ has high fidelity to the input $|k\rangle$, one expects its fidelity to states orthogonal to $|k\rangle$ will be small. In fact, since $\mathcal{E}$ is trace-preserving, taking the trace in the $|m\rangle$ basis gives:

$$\sum_m \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle = 1 , \tag{77}$$

and the fact that $\langle k|\mathcal{E}(|k\rangle\langle k|)|k\rangle \ge 1 - \eta$ then gives:

$$\sum_{m \neq k} \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle \le \eta . \tag{78}$$

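Let us assume the eigenvalues have been ordered from largest to smallest, $\lambda_1 = \lambda_1(\rho) \ge \lambda_2 = \lambda_2(\rho)$, etc. Then when $k = 1$, we have $\lambda_m \le \lambda_2$, so the $k = 1$ term is bounded:

$$\lambda_1 \sum_{m \neq 1} \lambda_m \langle m|\mathcal{E}(|1\rangle\langle 1|)|m\rangle \le \lambda_1 \lambda_2 \sum_{m \neq 1} \langle m|\mathcal{E}(|1\rangle\langle 1|)|m\rangle \le \lambda_1 \lambda_2 \eta . \tag{79}$$

When $k \neq 1$, we must use the looser bound $\lambda_m \le \lambda_1$ in a similar fashion, giving:

$$\sum_{k \neq 1} \lambda_k \sum_{m \neq k} \lambda_m \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle \le \sum_{k \neq 1} \lambda_k \lambda_1 \sum_{m \neq k} \langle m|\mathcal{E}(|k\rangle\langle k|)|m\rangle \le \sum_{k \neq 1} \lambda_k \lambda_1 \eta = (1 - \lambda_1)\lambda_1 \eta . \tag{80}$$

Thus

$$F_e \ge 1 - \big(1 + \lambda_1 \lambda_2 + (1 - \lambda_1)\lambda_1\big)\eta . \tag{81}$$

For given $\lambda_1$, this is minimized where $\lambda_2 = 1 - \lambda_1$. The resulting bound,

$$F_e \ge 1 - \big(1 + 2\lambda_1(1 - \lambda_1)\big)\eta , \tag{82}$$

is clearly minimized when $\lambda_1 = \lambda_2 = \frac{1}{2}$, giving

$$F_e(\rho, \mathcal{E}) \ge 1 - \frac{3}{2}\eta . \tag{83}$$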
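A quick numerical check of the bound $F_e \ge 1 - \frac{3}{2}\eta$ is possible for a channel whose pure-state fidelity is known in closed form. The example below is a sketch under stated assumptions, not part of the proof: it uses the single-qubit depolarizing channel $\rho \mapsto (1-p)\rho + p\,I/2$ with the maximally mixed input, and the helper names are hypothetical. For this channel every pure state has fidelity $1 - p/2$, so $\eta = p/2$, and the entanglement fidelity of $I/2$ is $1 - 3p/4$, which meets the bound with equality.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarizing_kraus(p):
    """Kraus operators of rho -> (1 - p) rho + p I/2 on one qubit."""
    return [np.sqrt(1 - 3 * p / 4) * I2] + [np.sqrt(p / 4) * P for P in (X, Y, Z)]

def pure_state_fidelity(kraus, psi):
    """<psi| E(|psi><psi|) |psi> = sum_i |<psi|A_i|psi>|^2."""
    return sum(abs(psi.conj() @ A @ psi) ** 2 for A in kraus)

def entanglement_fidelity(kraus, rho):
    """F_e(rho, E) = sum_i |tr(rho A_i)|^2."""
    return sum(abs(np.trace(rho @ A)) ** 2 for A in kraus)

p = 0.2
kraus = depolarizing_kraus(p)
rng = np.random.default_rng(0)
samples = rng.normal(size=(200, 2)) + 1j * rng.normal(size=(200, 2))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)

# Sampled worst-case error; exact here because the fidelity is state independent.
eta = 1 - min(pure_state_fidelity(kraus, psi) for psi in samples)
f_e = entanglement_fidelity(kraus, I2 / 2)
print("eta:", eta, "F_e:", f_e, "bound:", 1 - 1.5 * eta)
```

For a general channel, random sampling only underestimates $\eta$, so a check of this kind is meaningful only when the minimum fidelity is known exactly, as it is here.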



Corollary 1: $Q_e \ge Q_s$.
Proof: The theorem implies, as noted in [?], that if there is a sequence of encodings, decodings, and subspaces $H^{(n)}$ that achieves rate $R$ for subspace transmission, the sequence of uniform density operators on these subspaces, $I^{(n)}/\dim(H^{(n)})$, will also have limiting entanglement fidelity one under the same transmission operations. Since the entropy rate of this source is $R$, the same rate is achievable for entanglement transmission. ■



C. Consequences for Capacity 

The results of the two previous sections immediately im- 
ply that the capacities for pure-state transmission and for 
entanglement transmission are equal. They also imply that 
if a source can be sent on a given channel with high entan- 
glement fidelity, so can any source with lower entropy which 
satisfies the QAEP. Hence 

Theorem 3: Any source $\Sigma$ with $S(\Sigma) < C(\mathcal{N})$ may be transmitted with high entanglement fidelity over the channel $\mathcal{N}$.

VI. Encodings 

In [?], we conjectured that an expression for the quantum capacity was:

$$\lim_{n \to \infty}\ \max_{H_s,\ \rho^{(n)} \in H_s,\ \mathcal{E}^{(n)}: B(H_s) \to B(H_c)} \frac{1}{n} I_c(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}) , \tag{84}$$

and showed that this expression was no smaller than the capacity. This involves a maximization over input density operators and trace-preserving completely positive encoding maps. However, we also conjectured that the maximization over encodings was not necessary. Rather, the analogous expression with the maximization over encodings removed, and the density operator maximization done over density operators on the channel input Hilbert space $H_c$, instead of operators on the source space, was conjectured to also be a correct expression for the capacity. This would make the situation more similar to the classical one, where no maximization over encodings appears in the expression for channel capacity. In [16], we showed that if encodings could be restricted to be unitary, then indeed the maximization over encodings could be dropped entirely.

A. Partially isometric encodings 

Our strategy for removing the maximization over encodings will be to show that we may restrict our attention to partially isometric encodings, that is, encodings of the form

$$\mathcal{E}(\rho) = V \rho V^\dagger , \tag{85}$$

where $V$ is a partial isometry from the source space to the channel space. An encoding corresponding to a partial isometry from a source space to a smaller channel space (as in noiseless data compression, for instance) will be trace-decreasing for density operators having support outside the subspace that is unitarily mapped into the channel space. In our definition of channel capacity, we required that encodings be trace-preserving. But trace-decreasing encodings are relevant to our problem because they may be embedded in trace-preserving ones with no loss of fidelity. We say a trace-decreasing operation $\mathcal{T}$ is embedded in a trace-preserving operation $\mathcal{A}$ if

$$\mathcal{A} = \mathcal{T} + \mathcal{G} \tag{86}$$

for some trace-decreasing $\mathcal{G}$. Since

$$F_e(\rho, \mathcal{T} + \mathcal{G}) = F_e(\rho, \mathcal{T}) + F_e(\rho, \mathcal{G}) , \tag{87}$$

the entanglement fidelity of a trace-decreasing operation is a lower bound on the entanglement fidelity of any trace-preserving operation into which the trace-decreasing one has been embedded. This is what makes partially isometric encodings relevant to physical situations in which a trace-preserving encoding is used; we will use this in Section IX.



VII. Restricting the encodings 

We will show that if there exists a general encoding that 
achieves high fidelity transmission for a given source, there 
is also a partially isometric encoding achieving fidelity not 
much lower for that source, where "not much lower" will 
be quantified in such a way that if the general encoding has 
fidelity approaching one, the lower bound on fidelity with 
partially isometric encoding also approaches one, and there 
is no dimensional dependence in the relation between the 
fidelities that would cause difficulty with the large block 
limit (in which the Hilbert space dimension grows expo- 
nentially) . 

A. Perfect transmission 

The intuition behind the argument may be illustrated 
for the case of transmission with fidelity precisely one. It 
is frequently easy to show something for fidelity exactly 
one, but more difficult to extend it to fidelities which are 
merely very close to one, as is necessary for channel ca- 
pacity arguments, and that is the case here. If the oper- 
ation of encoding followed by noise followed by decoding 
achieves perfect transmission for some p, this implies that 
the encoding operation is perfectly reversible for p, since it 
is reversed by the composition of noise with decoding. As 
noted in [12], an operation $\mathcal{A}$ that is perfectly reversible for a density operator may, when restricted to the subspace $C$ (with dimension $d_C$) supporting that density operator, be written in the form of unitaries from the support into mutually orthogonal $d_C$-dimensional subspaces of the output space, randomly applied with probabilities $p_i$. That is,

$$\mathcal{A} \sim \{\sqrt{p_i}\, U_i\} , \qquad U_i^\dagger U_j = \delta_{ij} P_C , \tag{88}$$

where $P_C$ is the projector onto $C$. If there exists an operation which reverses this with perfect fidelity for some input $\rho$, it must reverse each of the unitaries $U_i$ with fidelity one. Hence we may remove the factor $\sqrt{p_i}$ from any of the operators in the canonical decomposition of the encoding, and use it as an encoding, which will achieve perfect transmission when the same decoding is used.

B. Isometric encoding suffices 

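Theorem 4: Given a trace-preserving map $\mathcal{A}$ and a map $\mathcal{E}$ with $\mathrm{tr}\,\mathcal{E}(\rho) = 1$ and

$$F_e(\rho, \mathcal{A}\mathcal{E}) \ge 1 - \eta , \tag{89}$$

there exists a partial isometry $W$ such that

$$F_e(\rho, \mathcal{A}W) \ge 1 - 2\eta . \tag{90}$$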



In applying this theorem, we will take $\mathcal{E}$ to be the encoding map, and $\mathcal{A}$ to be the concatenation of noise and decoding.

The proof proceeds via the following two lemmas:

Lemma 10: There exist operator decompositions of $\mathcal{A}$ and $\mathcal{E}$ such that $F_e(\rho, \mathcal{A}\mathcal{E}) \le F_e\big(\rho, A_1 E_1/\sqrt{\mathrm{tr}(E_1 \rho E_1^\dagger)}\big)$.

(Note that $E_1/\sqrt{\mathrm{tr}(E_1 \rho E_1^\dagger)}$ is not necessarily trace decreasing.)

Proof: Let $\{A_i\}$ and $\{E_j\}$ be operator decompositions of $\mathcal{A}$ and $\mathcal{E}$. Let $X$ be the matrix with elements $\mathrm{tr}(A_i E_j \rho)$. Then $F_e = \sum_{ij} |X_{ij}|^2$. The singular value decomposition ensures that by changing the operator decomposition of $\mathcal{A}$ and $\mathcal{E}$, we can transform to a representation where $X$ is diagonal; assume without loss of generality that $A_i$ and $E_j$ are already such representations. Then $F_e(\rho, \mathcal{A}\mathcal{E}) = \sum_k \mathrm{tr}(A_k E_k \rho)^2$ (since $\mathrm{tr}(A_k E_j \rho) = 0$ if $k \neq j$). Let $\lambda_k = \mathrm{tr}(E_k \rho E_k^\dagger)$. Then $\sum_k \lambda_k \big(\mathrm{tr}(A_k E_k \rho)^2/\lambda_k\big) = F_e(\rho, \mathcal{A}\mathcal{E})$ and $\sum_k \lambda_k = 1$, so there exists a $k$ such that $\mathrm{tr}(A_k E_k \rho)^2/\lambda_k \ge F_e(\rho, \mathcal{A}\mathcal{E})$. ■

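Lemma 11: Let

$$E : S \to C , \qquad A : C \to S \tag{91}$$

be linear operators, $\rho \in B(S)$ a density matrix. If $F_e(\rho, AE) \ge 1 - \eta$, $A^\dagger A \le I$ and $\mathrm{tr}(E \rho E^\dagger) = 1$, then there is a maximal partial isometry $W : S \to C$ such that $F_e(\rho, AW) \ge 1 - 2\eta$.

Proof: Let $U D_A V$ be a singular value decomposition of $A$. Here we can take $D_A$ to have matrix elements proportional to the Kronecker delta in a (fixed) basis for $C$, $V$ unitary on $C$ and $U : C \to S$ a maximal partial isometry. Consider the maximal partial isometry $W : S \to C$ defined by $W = V^\dagger U^\dagger$. Then $U = (VW)^\dagger$ and

$$\begin{aligned}
|\mathrm{tr}(AE\rho)|^2 &= \big|\mathrm{tr}\big(\rho^{1/2} U D_A^{1/2} V W U D_A^{1/2} V E \rho^{1/2}\big)\big|^2 \\
&\le \mathrm{tr}\big((U D_A^{1/2} V W)^\dagger \rho\, U D_A^{1/2} V W\big)\, \mathrm{tr}\big(U D_A^{1/2} V E \rho E^\dagger (U D_A^{1/2} V)^\dagger\big) \qquad (92) \\
&\le \mathrm{tr}\big(U D_A^{1/2} V W (VW)^\dagger D_A^{1/2} U^\dagger \rho\big) \qquad (93) \\
&= \mathrm{tr}(U D_A U^\dagger \rho) \\
&= \mathrm{tr}(U D_A V W \rho) . \qquad (94)
\end{aligned}$$

(The first inequality is an operator Schwarz inequality, while the second is due to the fact that $A^\dagger A$, and therefore $D_A$, is less than or equal to $I$, and the fact that if $B \ge 0$ and $I \ge C \ge 0$, $\mathrm{tr}\,BC \le \mathrm{tr}\,B$.) It follows that $\mathrm{tr}(AW\rho) \ge 1 - \eta$, hence $|\mathrm{tr}(AW\rho)|^2 = F_e(\rho, AW) \ge 1 - 2\eta$. ■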
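The key inequality in this proof, $|\mathrm{tr}(AE\rho)|^2 \le \mathrm{tr}(AW\rho)$ with $W = V^\dagger U^\dagger$ built from the singular value decomposition of $A$, can be spot-checked numerically. The sketch below is a simplified illustration under stated assumptions: the source and channel spaces are taken to have the same dimension (so $U$ and $W$ are unitary), the inputs are random matrices, and the function name `lemma11_check` is hypothetical.

```python
import numpy as np

def lemma11_check(A, E, rho):
    """Return (|tr(A E rho)|^2, tr(A W rho)) with W = V^dagger U^dagger.

    Assumes A is square with A^dagger A <= I; E is rescaled so that
    tr(E rho E^dagger) = 1, as the lemma requires.
    """
    E = E / np.sqrt(np.real(np.trace(E @ rho @ E.conj().T)))
    U, s, Vh = np.linalg.svd(A)           # A = U diag(s) Vh
    W = Vh.conj().T @ U.conj().T          # W = V^dagger U^dagger
    lhs = abs(np.trace(A @ E @ rho)) ** 2
    rhs = np.real(np.trace(A @ W @ rho))  # equals tr(U D_A U^dagger rho)
    return lhs, rhs

rng = np.random.default_rng(1)
d = 3
G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = G / np.linalg.norm(G, 2)              # spectral norm 1, so A^dagger A <= I
E = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
R = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = R @ R.conj().T
rho = rho / np.real(np.trace(rho))
lhs, rhs = lemma11_check(A, E, rho)
print(lhs, "<=", rhs)
```

Since $AW = U D_A U^\dagger$ is a positive operator, $\mathrm{tr}(AW\rho)$ is real and nonnegative, which is what lets the lemma pass from a bound on $\mathrm{tr}(AW\rho)$ to one on $F_e(\rho, AW) = |\mathrm{tr}(AW\rho)|^2$.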
To obtain Theorem 4 as a corollary, apply Lemma 10 and the premise of the theorem to get:

$$F_e\big(\rho, A_1 E_1/\sqrt{\mathrm{tr}(E_1 \rho E_1^\dagger)}\big) \ge F_e(\rho, \mathcal{A}\mathcal{E}) \ge 1 - \eta . \tag{95}$$

Lemma 11, with

$$E = E_1/\sqrt{\mathrm{tr}(E_1 \rho E_1^\dagger)} , \tag{96}$$

then gives the result.

It follows that if there exists a coding scheme which 
transmits the entanglement of a source reliably, there exists 
a coding scheme using partially isometric encodings which 
transmits it reliably. 

VIII. Forward classical communication doesn't help

Bennett, DiVincenzo, Smolin and Wootters (BDSW) [?] showed that a forward classical channel, from encoder
to decoder, cannot help one achieve perfect transmission. 
They did this by constructing, from any fidelity-one cod- 
ing scheme for such a channel, a fidelity-one coding scheme 
which makes no use of the classical channel. This is another 
example of a result which is apparently hard to extend to 
asymptotically high fidelity transmission. 

An argument virtually identical to the proof of Theorem 4 can be used to extend their result to the asymptotically high fidelity situation, for the problem of sending the entanglement of a uniform source with asymptotically high fidelity (cf. also [?]). (By results in Section V, this is equivalent to the problem BDSW considered, of sending every state in a source space with high fidelity.) We may do this by modeling a classical forward channel in a manner analogous to the model of the observed channel in [16], except that the decoder takes into account classical information about the encoding rather than the noise. We take the encoding to be a set $\{\mathcal{E}_m\}$ of trace-nonincreasing operations which sum to a trace-preserving operation. The value of the index $m$ represents classical information available to the encoder (as a measurement result, say) which may be sent to the decoder and used in decoding, so we allow the decoder to use one of a collection of trace-preserving decodings $\mathcal{D}_m$. Formally, we define a coding scheme with classical forward communication to be a sequence of such collections of encodings and decodings, $\{\mathcal{E}_m^{(n)}, \mathcal{D}_m^{(n)}\}$, where $m$ takes values $1 \ldots M^{(n)}$ so that the number of available encoding operations may depend on $n$, and

$$\mathcal{E}^{(n)} = \sum_m \mathcal{E}_m^{(n)} \tag{97}$$

is trace-preserving, while each of the $\mathcal{E}_m^{(n)}$ is trace-nonincreasing. We say a source may be sent reliably over this channel with classical forward communication if there exists a coding scheme such that

$$\lim_{n \to \infty} \sum_m F_e\big(\rho^{(n)}, \mathcal{D}_m^{(n)} \mathcal{N}^{\otimes n} \mathcal{E}_m^{(n)}\big) = 1 . \tag{98}$$

We define the capacity of a channel for entanglement transmission with forward classical communication, $Q_e^{(fc)}$, to be the supremum of the entropy rates of sources that can be sent reliably on the channel.

Now suppose the entanglement fidelity of a density operator $\rho$ sent through such a channel is high (omit the superscripts for clarity),

$$\sum_m F_e(\rho, \mathcal{D}_m \mathcal{N} \mathcal{E}_m) \ge 1 - \eta . \tag{99}$$

Then there exists a value $j$ of the index $m$ for which

$$\tilde{F}_e(\rho, \mathcal{D}_j \mathcal{N} \mathcal{E}_j) \ge 1 - \eta . \tag{100}$$

Now

$$\tilde{F}_e(\rho, \mathcal{D}_j \mathcal{N} \mathcal{E}_j) = F_e\Big(\rho, \mathcal{D}_j \mathcal{N} \frac{\mathcal{E}_j}{\mathrm{tr}\,\mathcal{E}_j(\rho)}\Big) , \tag{101}$$

so that by Theorem 4, there is a partial isometry $W$ such that

$$F_e(\rho, \mathcal{D}_j \mathcal{N} W) \ge 1 - 2\eta . \tag{102}$$

The partially isometric encoding can be extended to a trace-preserving encoding with no loss of fidelity, hence the same source can be sent without using the forward classical channel. Hence

Theorem 5: $Q_e^{(fc)} = Q_e$.



IX. An upper bound on capacity

We will now treat an issue raised in Section VI-A. We will show that the fact that partially isometric encodings suffice to achieve the channel capacity implies that we may omit the maximization over encodings from the expression that upper bounds the capacity.

Since the entanglement fidelity of any trace-preserving encoding into which a (possibly trace-decreasing) partially isometric encoding $\mathcal{V}$ might be embedded is bounded below by the unrenormalized entanglement fidelity of $\mathcal{V}$, we consider trace-preserving encodings $\mathcal{T} = \mathcal{V} + \mathcal{A}$, where $\mathcal{V}$ is partially isometric. Polar decompose $V$ into a maximal partial isometry $W$ and a positive $T$ (which will be a projector), so that $V = WT$.

We know from Theorem 4 that, given a sequence of general encodings $\mathcal{E}^{(n)}$ and decodings $\mathcal{D}^{(n)}$ that sends a given source (so that overall entanglement fidelity goes to one with increasing block size), there exists a sequence of partially isometric encodings $\mathcal{V}^{(n)}$ that (when used with the same decodings as before) sends that source with the unrenormalized entanglement fidelity approaching one with increasing block size. Hence the entanglement fidelity when some sequence of trace-preserving extensions $\mathcal{T}^{(n)} = \mathcal{V}^{(n)} + \mathcal{A}^{(n)}$ is used to encode goes to one with increasing block size as well. More precisely, if for a given $\epsilon$ and large enough $n$, we have

$$F_e\big(\rho^{(n)}, \mathcal{D}^{(n)} \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}\big) \ge 1 - \epsilon , \tag{103}$$

then by Theorem 4 for large enough $n$

$$F_e\big(\rho^{(n)}, \mathcal{D}^{(n)} \mathcal{N}^{\otimes n} \mathcal{T}^{(n)}\big) \ge 1 - 2\epsilon . \tag{104}$$

Now let us consider the fidelity of the output states $\rho^{RQ''} = \mathcal{D}^{(n)} \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}(\rho^{(n)})$ and $\sigma^{RQ''} = \mathcal{D}^{(n)} \mathcal{N}^{\otimes n} \mathcal{T}^{(n)}(\rho^{(n)})$ obtained by using the different encodings. By Lemma [?],

$$F(\rho^{RQ''}, \sigma^{RQ''}) \ge 1 - 3\epsilon . \tag{105}$$



To obtain the upper bound on capacity in [16], we used the following fact:

$$S(\rho) \le I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}\big) + 2 + 4\big(1 - F_e(\rho, \mathcal{D}^{(n)} \mathcal{N}^{\otimes n} \mathcal{E}^{(n)})\big) \log d_c^{\,n} , \tag{106}$$

where all operations involved are trace-preserving ($d_c$ is the dimension of the channel). In particular, this holds for the operations $\mathcal{T}^{(n)}$. We now consider the coherent information with such encoding operations.

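Recall the representation of the coherent information $I_c$ as a conditional entropy and apply Lemma [?], the continuity of conditional entropy, to obtain:

$$\Big| I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}\big) - I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{T}^{(n)}\big) \Big| \le 6\sqrt{3\epsilon}\, \log d^n + 2 . \tag{107}$$

Hence

$$\lim_{n \to \infty} \frac{1}{n} \Big| \max_{\rho^{(n)}} I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{E}^{(n)}\big) - \max_{\rho^{(n)}} I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n} \mathcal{T}^{(n)}\big) \Big| = 0 . \tag{108}$$

So, the coherent information bound with general encodings is the same as the bound for encodings restricted to have the form $\mathcal{T}^{(n)}$.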

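We now show that this bound implies the bound with the maximization over channel input density operators alone. In earlier work [16], we defined the coherent information of a non-trace-preserving operation as:

$$I_c(\rho, \mathcal{E}) = S\Big(\frac{\mathcal{E}(\rho)}{\mathrm{tr}\,\mathcal{E}(\rho)}\Big) - S\Big(\frac{(\mathcal{I} \otimes \mathcal{E})(|\Psi^{RQ}\rangle\langle\Psi^{RQ}|)}{\mathrm{tr}\,\mathcal{E}(\rho)}\Big) , \tag{109}$$

the conditional entropy using the renormalized output state of the system and entangled reference. Now, the coherent information of the channel $\mathcal{T}$ is bounded above by the coherent information of the observed channel $\mathcal{N}\{\mathcal{V}, \mathcal{A}\}$ in which we know which of $\mathcal{V}$ and $\mathcal{A}$ occurred. The latter is given [?] by:

$$I_c\big(\rho, \mathcal{N}^{\otimes n}\{\mathcal{V}, \mathcal{A}\}\big) = (\mathrm{tr}\,T\rho)\, I_c\big(\rho, \mathcal{N}^{\otimes n} \mathcal{V}\big) + (1 - \mathrm{tr}\,T\rho)\, I_c\big(\rho, \mathcal{N}^{\otimes n} \mathcal{A}\big) . \tag{110}$$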
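The definition (109) of coherent information for a possibly trace-decreasing operation can be made concrete with a short numerical sketch. Everything named here is an assumption for illustration: the function names are hypothetical, the operation is specified by square Kraus operators (input and output spaces of equal dimension), entropies are in bits, and the reference system is placed first in the tensor product.

```python
import numpy as np

def entropy_bits(op, tol=1e-12):
    """Von Neumann entropy in bits, ignoring numerically zero eigenvalues."""
    evals = np.linalg.eigvalsh(op)
    evals = evals[evals > tol]
    return float(-np.sum(evals * np.log2(evals)))

def coherent_information(rho, kraus):
    """S(E(rho)/tr E(rho)) - S((I x E)(|Psi><Psi|)/tr E(rho)), as in (109)."""
    d = rho.shape[0]
    lam, vecs = np.linalg.eigh(rho)
    lam = np.clip(lam, 0.0, None)
    # Purification |Psi> = sum_k sqrt(lam_k) |k_R>|k_Q>, reference R first.
    psi = sum(np.sqrt(lam[k]) * np.kron(np.eye(d)[k], vecs[:, k]) for k in range(d))
    joint = np.zeros((d * d, d * d), dtype=complex)
    for A in kraus:
        phi = np.kron(np.eye(d), A) @ psi
        joint += np.outer(phi, phi.conj())
    out = sum(A @ rho @ A.conj().T for A in kraus)
    norm = np.real(np.trace(out))          # tr E(rho); equals tr of joint output
    return entropy_bits(out / norm) - entropy_bits(joint / norm)

rho = np.eye(2) / 2
identity_channel = [np.eye(2)]
projector = [np.array([[1.0, 0.0], [0.0, 0.0]])]   # trace-decreasing example
print(coherent_information(rho, identity_channel))
print(coherent_information(rho, projector))
```

With a maximally mixed qubit as input, the identity channel preserves all the entanglement with the reference, while the single-projector operation leaves a renormalized product state and so carries none.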



A straightforward calculation shows that the first term is equal to

$$(\mathrm{tr}\,T\rho)\, I_c\big(\widetilde{WT\rho TW^\dagger}, \mathcal{N}^{\otimes n}\big) . \tag{111}$$

(We use the notation $\tilde{A} = A/\mathrm{tr}\,A$.) Since $1 \ge \mathrm{tr}\,T\rho \ge F_e(\rho, \mathcal{D}\mathcal{N}^{\otimes n}\mathcal{V})$ and the latter approaches one in the large-$n$ limit, so does $\mathrm{tr}\,T\rho$, and hence:

$$S(\Sigma) \le \lim_{n \to \infty} \frac{1}{n} I_c\big(\widetilde{WT\rho TW^\dagger}, \mathcal{N}^{\otimes n}\big) . \tag{112}$$

The inequality still holds when we maximize over $\rho$. The ability to maximize over $\rho$ followed by projection using $T$, normalization, and placing the density operator in some subspace of the channel via $W$ just allows us to access some of the possible channel input density matrices. Hence the RHS of (112) is bounded above by:

$$\lim_{n \to \infty} \max_{\rho^{(n)}} \frac{1}{n} I_c\big(\rho^{(n)}, \mathcal{N}^{\otimes n}\big) . \tag{113}$$

This is the promised upper bound on the quantum capacity.